Abstract

We develop a calculus for lazy functional programming based on recursion operators 
associated with data type definitions. For these operators we derive various algebraic 
laws that are useful in deriving and manipulating programs. We shall show that all 
example functions in Bird and Wadler's "Introduction to Functional Programming" can 
be expressed using these operators. 



1 Introduction 

Among the many styles and methodologies for the construction of computer programs the 
Squiggol style in our opinion deserves attention from the functional programming community. 
The overall goal of Squiggol is to calculate programs from their specification in the way a math- 
ematician calculates solutions to differential equations, or uses arithmetic to solve numerical 
problems. 

It is not hard to state, prove and use laws for well-known operations such as addition, multi- 
plication and — at the function level — composition. It is, however, quite hard to state, prove 
and use laws for arbitrarily recursively defined functions, mainly because it is difficult to refer to 
the recursion scheme in isolation. The algorithmic structure is obscured by using unstructured 
recursive definitions. We crack this problem by treating various recursion schemes as separate 
higher order functions, giving each a notation of its own independent of the ingredients with 
which it constitutes a recursively defined function.

This philosophy is similar in spirit to the 'structured programming' methodology for imperative
programming. The use of arbitrary goto's is abandoned in favour of structured control flow 
primitives such as conditionals and while-loops that replace fixed patterns of goto's, so that rea- 
soning about programs becomes feasible and sometimes even elegant. For functional programs 
the question is which recursion schemes are to be chosen as a basis for a calculus of programs. 
We shall consider several recursion operators that are naturally associated with algebraic type 
definitions. A number of general theorems are proven about these operators and subsequently 
used to transform programs and prove their correctness. 

Bird and Meertens [4, 18] have identified several laws for specific data types (most notably finite 
lists) using which they calculated solutions to various programming problems. By embedding 
the calculus into a categorical framework, Bird and Meertens' work on lists can be extended 
to arbitrary, inductively defined data types [17, 12]. Recently the group of Backhouse [1] has 
extended the calculus to a relational framework, thus covering indeterminacy.

Independently, Paterson [21] has developed a calculus of functional programs similar in contents 
but very dissimilar in appearance (like many Australian animals) to the work referred to above. 
Actually if one pricks through the syntactic differences the laws derived by Paterson are the 
same and in some cases slightly more general than those developed by the Squiggolers.

This paper gives an extension of the theory to the context of lazy functional programming, i.e., 
for us a type is an ω-cpo and we consider only continuous functions between types (categorically,
we are working in the category CPO). Working in the category SET as done by for example 
Malcolm [17] or Hagino [14] means that finite data types (defined as initial algebras) and infinite 
data types (defined as final co-algebras) constitute two different worlds. In that case it is not 
possible to define functions by induction (catamorphisms) that are applicable to both finite and 
infinite data types, and arbitrary recursive definitions are not allowed. Working in CPO has 
the advantage that the carriers of initial algebras and final co-algebras coincide, thus there is a 
single data type that comprises both finite and infinite elements. The price to be paid however 
is that partiality of both functions and values becomes unavoidable. 



2 The data type of lists 

We shall illustrate the recursion patterns of interest by means of the specific data type of cons- 
lists. So, the definitions given here are actually specific instances of those given in §4. Modern 
functional languages allow the definition of cons-lists over some type A by putting: 

$$A* ::= \mathit{Nil} \mid \mathit{Cons}\ (A \| A*)$$

The recursive structure of this definition is employed when writing functions $\in A* \to B$ that
destruct a list; these have been called catamorphisms (from the Greek preposition κατα meaning
"downwards" as in "catastrophe"). Anamorphisms are functions $\in B \to A*$ (from the Greek
preposition ανα meaning "upwards" as in "anabolism") that generate a list of type $A*$ from a
seed from $B$. Functions of type $A \to B$ whose call-tree has the shape of a cons-list are called
hylomorphisms (from the Aristotelian philosophy that form and matter are one, ὕλη meaning
"dust" or "matter").



Catamorphisms 

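Let $b \in B$ and $\oplus \in A \| B \to B$, then a list-catamorphism $h \in A* \to B$ is a function of the
following form:

$$\begin{aligned}
h\ \mathit{Nil} &= b \\
h\ (\mathit{Cons}\ (a, as)) &= a \oplus (h\ as)
\end{aligned} \tag{1}$$

In the notation of Bird&Wadler [5] one would write $h = \mathit{foldr}\ b\ (\oplus)$. We write catamorphisms
by wrapping the relevant constituents between so-called banana brackets:

$$h = (\!|b, \oplus|\!) \tag{2}$$

Countless list processing functions are readily recognizable as catamorphisms, for example
$\mathit{length} \in A* \to \mathit{Num}$, or $\mathit{filter}\ p \in A* \to A*$, with $p \in A \to \mathit{bool}$.

$$\begin{aligned}
\mathit{length} &= (\!|0, \oplus|\!) \quad \text{where } a \oplus n = 1 + n \\
\mathit{filter}\ p &= (\!|\mathit{Nil}, \oplus|\!) \quad \text{where } a \oplus as =
  \begin{cases} \mathit{Cons}\ (a, as), & p\ a \\ as, & \neg p\ a \end{cases}
\end{aligned}$$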

Separating the recursion pattern for catamorphisms $(\!|\_|\!)$ from its ingredients $b$ and $\oplus$ makes it
feasible to reason about catamorphic programs in an algebraic way. For example the Fusion
Law for catamorphisms over lists reads:

$$f \circ (\!|b, \oplus|\!) = (\!|c, \otimes|\!) \;\Leftarrow\; f\ b = c \;\wedge\; f\ (a \oplus as) = a \otimes (f\ as)$$

Without special notation pinpointing catas, such as $(\!|\_|\!)$ or foldr, we would be forced to
formulate the fusion law as follows.

Let $h$, $g$ be given by

$$\begin{aligned}
h\ \mathit{Nil} &= b & g\ \mathit{Nil} &= c \\
h\ (\mathit{Cons}\ (a, as)) &= a \oplus (h\ as) & g\ (\mathit{Cons}\ (a, as)) &= a \otimes (g\ as)
\end{aligned}$$

then $f \circ h = g$ if $f\ b = c$ and $f\ (a \oplus as) = a \otimes (f\ as)$.
A clumsy way of stating such a simple algebraic property.
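
The list catamorphism (1) and the Fusion Law can be tried out on ordinary finite lists. The
following is a minimal Python sketch and not part of the original development: `cata`,
`filter_`, `total`, `double` and `fused` are hypothetical names, Python lists stand in for
cons-lists, and $\bot$ and infinite lists are ignored.

```python
def cata(b, op):
    """List catamorphism (|b, op|): h Nil = b; h (Cons (a, as)) = a `op` (h as)."""
    def h(xs):
        acc = b
        for a in reversed(xs):      # fold from the right, like foldr
            acc = op(a, acc)
        return acc
    return h

length = cata(0, lambda a, n: 1 + n)

def filter_(p):
    # keep a when p a holds, otherwise drop it
    return cata([], lambda a, rest: [a] + rest if p(a) else rest)

# Fusion: f . (|b, op|) = (|c, ot|)  if  f b = c  and  f (a op r) = a ot (f r).
total = cata(0, lambda a, r: a + r)
double = lambda n: 2 * n
fused = cata(0, lambda a, r: 2 * a + r)   # double (a + r) = 2*a + double r
```

Here `double` satisfies both premises of the law (`double(0) == 0`, and doubling distributes
over `+`), so `double` after `total` agrees with the single catamorphism `fused`.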
Anamorphisms



Given a predicate $p \in B \to \mathit{bool}$ and a function $g \in B \to A \| B$, a list-anamorphism
$h \in B \to A*$ is defined as:

$$h\ b = \begin{cases} \mathit{Nil}, & p\ b \\ \mathit{Cons}\ (a, h\ b'), & \text{otherwise, where } (a, b') = g\ b \end{cases} \tag{3}$$

Anamorphisms are not well known in the functional programming folklore; they are called
unfold by Bird&Wadler, who spend only a few words on them. We denote anamorphisms
by wrapping the relevant ingredients between concave lenses:

$$h = [\!(g, p)\!] \tag{4}$$

Many important list-valued functions are anamorphisms; for example $\mathit{zip} \in A* \| B* \to (A \| B)*$
which 'zips' a pair of lists into a list of pairs.

$$\begin{aligned}
\mathit{zip} &= [\!(g, p)\!] \\
p\ (as, bs) &= (as = \mathit{Nil}) \vee (bs = \mathit{Nil}) \\
g\ (\mathit{Cons}\ (a, as), \mathit{Cons}\ (b, bs)) &= ((a, b), (as, bs))
\end{aligned}$$

Another anamorphism is $\mathit{iterate}\ f$ which, given $a$, constructs the infinite list of iterated
applications of $f$ to $a$.

$$\mathit{iterate}\ f = [\!(g, \mathit{false}^\bullet)\!] \quad \text{where } g\ a = (a, f\ a)$$

We use $c^\bullet$ to denote the constant function $\lambda x.c$.

Given $f \in A \to B$, the map function $f* \in A* \to B*$ applies $f$ to every element in a given list.

$$\begin{aligned}
f*\ \mathit{Nil} &= \mathit{Nil} \\
f*\ (\mathit{Cons}\ (a, as)) &= \mathit{Cons}\ (f\ a, f*\ as)
\end{aligned}$$

Since a list appears at both sides of its type, we might suspect that map can be written
both as a catamorphism and as an anamorphism. Indeed this is the case. As catamorphism:
$f* = (\!|\mathit{Nil}, \oplus|\!)$ where $a \oplus bs = \mathit{Cons}\ (f\ a, bs)$, and as anamorphism $f* = [\!(g, p)\!]$ where
$p\ as = (as = \mathit{Nil})$ and $g\ (\mathit{Cons}\ (a, as)) = (f\ a, as)$.
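
Definition (3) can be mimicked with a Python generator, which gives just enough laziness to
handle the infinite list produced by iterate. This is an illustrative sketch under stated
assumptions: the names `ana`, `zip_`, `iterate`, `map_ana` and `first5` are hypothetical, the
seeds of `zip_` and `map_ana` are ordinary Python lists, and $\bot$ is not modelled.

```python
from itertools import islice

def ana(g, p):
    """List anamorphism [(g, p)]: stop when p holds, else emit a and continue from b'."""
    def h(b):
        while not p(b):
            a, b = g(b)
            yield a
    return h

# zip: the seed is a pair of lists; stop as soon as either one is empty.
zip_ = ana(lambda s: ((s[0][0], s[1][0]), (s[0][1:], s[1][1:])),
           lambda s: s[0] == [] or s[1] == [])

def iterate(f):
    # the stop predicate is constantly false, so the result is infinite
    return ana(lambda a: (a, f(a)), lambda a: False)

def map_ana(f):
    # map written as an anamorphism: p as = (as = Nil), g (Cons (a, as)) = (f a, as)
    return ana(lambda xs: (f(xs[0]), xs[1:]), lambda xs: xs == [])

# Only a finite prefix of an infinite list can be inspected.
first5 = list(islice(iterate(lambda n: 2 * n)(1), 5))
```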

Hylomorphisms 

A recursive function $h \in A \to C$ whose call-tree is isomorphic to a cons-list, i.e., a linear
recursive function, is called a hylomorphism. Let $c \in C$ and $\oplus \in B \| C \to C$ and $g \in A \to B \| A$
and $p \in A \to \mathit{bool}$ then these determine the hylomorphism $h$

$$h\ a = \begin{cases} c, & p\ a \\ b \oplus (h\ a'), & \text{otherwise, where } (b, a') = g\ a \end{cases} \tag{5}$$

This is exactly the same structure as an anamorphism except that Nil has been replaced by $c$
and Cons by $\oplus$. We write hylomorphisms by wrapping the relevant parts into envelopes.

$$h = [\![(c, \oplus), (g, p)]\!] \tag{6}$$

A hylomorphism corresponds to the composition of an anamorphism that builds the call-tree as
an explicit data structure and a catamorphism that reduces this data object into the required
value.

$$[\![(c, \oplus), (g, p)]\!] = (\!|c, \oplus|\!) \circ [\!(g, p)\!]$$

A proof of this equality will be given in §15.

An archetypical hylomorphism is the factorial function:

$$\begin{aligned}
\mathit{fac} &= [\![(1, \times), (g, p)]\!] \\
p\ n &= (n = 0) \\
g\ (1 + n) &= (1 + n, n)
\end{aligned}$$
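
The factorial hylomorphism and its decomposition into an anamorphism followed by a
catamorphism can be checked on small inputs. The sketch below is illustrative only: `hylo`,
`unfold`, `fold` and `fac_split` are hypothetical names, and Python's strict evaluation means
the intermediate list in `fac_split` is fully built before it is consumed.

```python
def hylo(c, op, g, p):
    """Hylomorphism [[(c, op), (g, p)]] as in (5): linear recursion, no list is built."""
    def h(a):
        if p(a):
            return c
        b, a2 = g(a)
        return op(b, h(a2))
    return h

# fac = [[(1, *), (g, p)]] with p n = (n = 0) and g (1 + n) = (1 + n, n)
fac = hylo(1, lambda b, r: b * r, lambda n: (n, n - 1), lambda n: n == 0)

def unfold(g, p, a):
    """The anamorphism [(g, p)], building the call-tree as an explicit list."""
    out = []
    while not p(a):
        b, a = g(a)
        out.append(b)
    return out

def fold(c, op, xs):
    """The catamorphism (|c, op|), reducing that list."""
    acc = c
    for b in reversed(xs):
        acc = op(b, acc)
    return acc

def fac_split(n):
    return fold(1, lambda b, r: b * r,
                unfold(lambda m: (m, m - 1), lambda m: m == 0, n))
```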



Paramorphisms 

The hylomorphism definition of the factorial may be correct but is unsatisfactory from a theoretical
point of view since it is not inductively defined on the data type $\mathit{num} ::= 0 \mid 1 + \mathit{num}$. There
is however no 'simple' $\varphi$ such that $\mathit{fac} = (\!|\varphi|\!)$. The problem with the factorial is that it "eats
its argument and keeps it too" [27]; the brute force catamorphic solution would therefore have
$\mathit{fac}'$ return a pair $(n, n!)$ to be able to compute $(n + 1)!$.

Paramorphisms were investigated by Meertens [19] to cover this pattern of primitive recursion.
For type $\mathit{num}$ a paramorphism is a function $h$ of the form:

$$\begin{aligned}
h\ 0 &= b \\
h\ (1 + n) &= n \oplus (h\ n)
\end{aligned} \tag{7}$$

For lists a paramorphism is a function $h$ of the form:

$$\begin{aligned}
h\ \mathit{Nil} &= b \\
h\ (\mathit{Cons}\ (a, as)) &= a \oplus (as, h\ as)
\end{aligned}$$

We write paramorphisms by wrapping the relevant constituents in barbed wire, $h = [\!\langle b, \oplus\rangle\!]$; thus
we may write $\mathit{fac} = [\!\langle 1, \oplus\rangle\!]$ where $n \oplus m = (1 + n) \times m$. The function $\mathit{tails} \in A* \to A**$,
which gives the list of all tail segments of a given list, is defined by the paramorphism
$\mathit{tails} = [\!\langle \mathit{Cons}\ (\mathit{Nil}, \mathit{Nil}), \oplus\rangle\!]$ where $a \oplus (as, tls) = \mathit{Cons}\ (\mathit{Cons}\ (a, as), tls)$.
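
Both paramorphism patterns translate directly into loops that hand the operator the
unconsumed part of the argument alongside the recursive result. A minimal sketch, assuming
Python integers for num and Python lists for cons-lists; `para_num` and `para_list` are
hypothetical names.

```python
def para_num(b, op):
    """Paramorphism on naturals: h 0 = b; h (1 + n) = n `op` (h n)."""
    def h(n):
        acc = b
        for k in range(n):          # k runs over the predecessors 0 .. n-1
            acc = op(k, acc)
        return acc
    return h

def para_list(b, op):
    """Paramorphism on lists: h Nil = b; h (Cons (a, as)) = a `op` (as, h as)."""
    def h(xs):
        acc = b
        for i in range(len(xs) - 1, -1, -1):
            acc = op(xs[i], (xs[i + 1:], acc))   # the tail itself and its result
        return acc
    return h

# fac = [<1, op>] where n op m = (1 + n) * m
fac = para_num(1, lambda n, m: (1 + n) * m)

# tails = [<Cons (Nil, Nil), op>] where a op (as, tls) = Cons (Cons (a, as), tls)
tails = para_list([[]], lambda a, pair: [[a] + pair[0]] + pair[1])
```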



3 Algebraic data types 

In the preceding section we have given specific notations for some recursion patterns in connec- 
tion with the particular type of cons-lists. In order to define the notions of cata-, ana-, hylo- 
and paramorphism for arbitrary data types, we now present a generic theory of data types and 
functions on them. For this we consider a recursive data type (also called 'algebraic' data type 
in Miranda) to be defined as the least fixed point of a functor¹.

Functors 

A bifunctor $\dagger$ is a binary operation taking types into types and functions into functions such
that if $f \in A \to B$ and $g \in C \to D$ then $f \dagger g \in A \dagger C \to B \dagger D$, and which preserves identities
and composition:

$$\begin{aligned}
\mathit{id} \dagger \mathit{id} &= \mathit{id} \\
f \dagger g \circ h \dagger j &= (f \circ h) \dagger (g \circ j)
\end{aligned}$$

Bifunctors are denoted by $\dagger, \ddagger, \S, \ldots$

A monofunctor is a unary type operation $\mathsf{F}$, which is also an operation on functions,
$\mathsf{F} \in (A \to B) \to (A\mathsf{F} \to B\mathsf{F})$, that preserves the identity and composition. We use $\mathsf{F}, \mathsf{G}, \ldots$ to denote
monofunctors. In view of the notation $A*$ we write the application of a functor as a postfix:
$A\mathsf{F}$. In §5 we will show that $*$ is a functor indeed.

The data types found in all current functional languages can be defined by using the following 
basic functors. 

Product The (lazy) product $D \| D'$ of two types $D$ and $D'$ and its operation $\|$ on functions
are defined as:

$$\begin{aligned}
D \| D' &= \{(d, d') \mid d \in D, d' \in D'\} \\
(f \| g)\ (x, x') &= (f\ x, g\ x')
\end{aligned}$$

Closely related to the functor $\|$ are the projection and tupling combinators:

$$\begin{aligned}
\grave\pi\ (x, y) &= x \\
\acute\pi\ (x, y) &= y \\
(f \triangle g)\ x &= (f\ x, g\ x)
\end{aligned}$$

Using $\grave\pi$, $\acute\pi$ and $\triangle$ we can express $f \| g$ as $f \| g = (f \circ \grave\pi) \triangle (g \circ \acute\pi)$. We can also define $\triangle$ using
$\|$ and the doubling combinator $\Delta\ x = (x, x)$, since $f \triangle g = f \| g \circ \Delta$.

¹ We give the definitions of various concepts of category theory only for the special case of the category
CPO. Also 'functors' are really endo-functors, and so on.



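Sum The sum $D \mid D'$ of $D$ and $D'$ and the operation $\mid$ on functions are defined as:

$$\begin{aligned}
D \mid D' &= (\{0\} \| D) \cup (\{1\} \| D') \cup \{\bot\} \\
(f \mid g)\ \bot &= \bot \\
(f \mid g)\ (0, x) &= (0, f\ x) \\
(f \mid g)\ (1, x') &= (1, g\ x')
\end{aligned}$$

The arbitrarily chosen numbers 0 and 1 are used to 'tag' the values of the two summands so
that they can be distinguished. Closely related to the functor $\mid$ are the injection and selection
combinators:

$$\begin{aligned}
\grave\imath\ x &= (0, x) \\
\acute\imath\ y &= (1, y) \\
(f \triangledown g)\ \bot &= \bot \\
(f \triangledown g)\ (0, x) &= f\ x \\
(f \triangledown g)\ (1, y) &= g\ y
\end{aligned}$$

with which we can write $f \mid g = (\grave\imath \circ f) \triangledown (\acute\imath \circ g)$. Using $\nabla$ which removes the tags from its
argument, $\nabla\ \bot = \bot$ and $\nabla\ (i, x) = x$, we can define $f \triangledown g = \nabla \circ f \mid g$.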



Arrow The operation $\to$ that forms the function space $D \to D'$ of continuous functions from
$D$ to $D'$, has as action on functions the 'wrapping' function:

$$(f \to g)\ h = g \circ h \circ f$$

Often we will use the alternative notation $(g \leftarrow f)\ h = g \circ h \circ f$, where we have swapped the
arrow already so that upon application the arguments need not be moved, thus localizing the
changes occurring during calculations. The functional $(f \stackrel{\mathsf{F}}{\leftarrow} g)\ h = f \circ h\mathsf{F} \circ g$ wraps its $\mathsf{F}$-ed
argument between $f$ and $g$.

Closely related to the $\to$ are the combinators:

$$\begin{aligned}
\mathit{curry}\ f\ x\ y &= f\ (x, y) \\
\mathit{uncurry}\ f\ (x, y) &= f\ x\ y \\
\mathit{eval}\ (f, x) &= f\ x
\end{aligned}$$

Note that $\to$ is contra-variant in its first argument, i.e. $(f \to g) \circ (h \to j) = (h \circ f) \to (g \circ j)$.



Identity, Constants The identity functor $\mathsf{I}$ is defined on types as $D\mathsf{I} = D$ and on functions
as $f\mathsf{I} = f$. Any type $D$ induces a functor with the same name $\underline{D}$, whose operation on objects
is given by $C\underline{D} = D$, and on functions $f\underline{D} = \mathit{id}$.



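Lifting For mono-functors $\mathsf{F}$, $\mathsf{G}$ and bi-functor $\dagger$ we define the mono-functors $\mathsf{FG}$ and $\mathsf{F} \dagger \mathsf{G}$ by

$$\begin{aligned}
x(\mathsf{FG}) &= (x\mathsf{F})\mathsf{G} \\
x(\mathsf{F} \dagger \mathsf{G}) &= (x\mathsf{F}) \dagger (x\mathsf{G})
\end{aligned}$$

for both types and functions $x$.

In view of the first equation we need not write parentheses in $x\mathsf{FG}$. Notice that in $(\mathsf{F} \dagger \mathsf{G})$ the
bi-functor $\dagger$ is 'lifted' to act on functors rather than on objects; $(\mathsf{F} \dagger \mathsf{G})$ is itself a mono-functor.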



Sectioning Analogous to the sectioning of binary operators, $(a\oplus)\ b = a \oplus b$ and $(\oplus b)\ a =
a \oplus b$, we define sectioning of bi-functors $\dagger$:

$$\begin{aligned}
(A\dagger) &= \underline{A} \dagger \mathsf{I} \\
(f\dagger) &= f \dagger \mathit{id}
\end{aligned}$$

hence $B(A\dagger) = A \dagger B$ and $f(A\dagger) = \mathit{id} \dagger f$. Similarly we can define sectioning of $\dagger$ in its second
argument, i.e. $(\dagger B)$ and $(\dagger f)$.

It is not too difficult to verify the following two properties of sectioned functors:

$$(f\dagger) \circ g(A\dagger) = g(B\dagger) \circ (f\dagger) \quad \text{for all } f \in A \to B \tag{8}$$

$$(f\dagger) \circ (g\dagger) = ((f \circ g)\dagger) \tag{9}$$

Taking $f \dagger g = g \to f$, thus $(f\dagger) = (f\circ)$, gives some nice laws for function composition.

Laws for the basic combinators



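There are various equations involving the above combinators; we state only a few of
these. In parsing an expression function composition has least binding power while $\|$ binds
stronger than $\mid$.

$$\begin{aligned}
\grave\pi \circ f \| g &= f \circ \grave\pi & f \mid g \circ \grave\imath &= \grave\imath \circ f \\
\grave\pi \circ f \triangle g &= f & f \triangledown g \circ \grave\imath &= f \\
\acute\pi \circ f \| g &= g \circ \acute\pi & f \mid g \circ \acute\imath &= \acute\imath \circ g \\
\acute\pi \circ f \triangle g &= g & f \triangledown g \circ \acute\imath &= g \\
(\grave\pi \circ h) \triangle (\acute\pi \circ h) &= h & (h \circ \grave\imath) \triangledown (h \circ \acute\imath) &= h \;\Leftarrow\; h \text{ strict} \\
\grave\pi \triangle \acute\pi &= \mathit{id} & \grave\imath \triangledown \acute\imath &= \mathit{id} \\
f \| g \circ h \triangle j &= (f \circ h) \triangle (g \circ j) & f \triangledown g \circ h \mid j &= (f \circ h) \triangledown (g \circ j) \\
f \triangle g \circ h &= (f \circ h) \triangle (g \circ h) & f \circ g \triangledown h &= (f \circ g) \triangledown (f \circ h) \;\Leftarrow\; f \text{ strict} \\
f \| g = h \| j &\equiv f = h \wedge g = j & f \mid g = h \mid j &\equiv f = h \wedge g = j \\
f \triangle g = h \triangle j &\equiv f = h \wedge g = j & f \triangledown g = h \triangledown j &\equiv f = h \wedge g = j
\end{aligned}$$

A nice law relating $\triangle$ and $\triangledown$ is the abides law:

$$(f \triangle g) \triangledown (h \triangle j) = (f \triangledown h) \triangle (g \triangledown j) \tag{10}$$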
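
The product and sum combinators are small enough to model directly, which makes it easy to
spot-check laws such as (10) on sample values. The following sketch is an illustration only:
all function names are hypothetical, pairs are Python tuples, sums are tagged tuples `(0, x)`
and `(1, y)` as in the definition of $D \mid D'$, and the $\bot$ cases are left out.

```python
def cross(f, g):                 # f || g : acts on a pair componentwise
    return lambda p: (f(p[0]), g(p[1]))

def split(f, g):                 # f /\ g : tupling
    return lambda x: (f(x), g(x))

def inl(x):                      # left injection, tag 0
    return (0, x)

def inr(y):                      # right injection, tag 1
    return (1, y)

def plus(f, g):                  # f | g : acts under the tag and keeps it
    return lambda t: (0, f(t[1])) if t[0] == 0 else (1, g(t[1]))

def junc(f, g):                  # f \/ g : case analysis, drops the tag
    return lambda t: f(t[1]) if t[0] == 0 else g(t[1])

fst = lambda p: p[0]
snd = lambda p: p[1]

# Sample functions for spot-checking laws.
f = lambda x: x + 1
g = lambda x: x * 2
h = lambda x: x - 1
j = lambda x: x * x

abides_lhs = junc(split(f, g), split(h, j))
abides_rhs = split(junc(f, h), junc(g, j))
```

Agreement on a handful of inputs is of course no proof; it only guards against misreading
the laws.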



Varia 

The one element type is denoted 1 and can be used to model constants of type A by nullary 
functions of type 1 — > A. The only member of 1 called void is denoted by (). 

In some examples we use for a given predicate $p \in A \to \mathit{bool}$, the function:

$$\begin{aligned}
p? &\in A \to A \mid A \\
p?\ a &= \begin{cases} \bot, & p\ a = \bot \\ \grave\imath\ a, & p\ a = \mathit{true} \\ \acute\imath\ a, & p\ a = \mathit{false} \end{cases}
\end{aligned}$$

thus $f \triangledown g \circ p?$ models the familiar conditional if $p$ then $f$ else $g$ fi. The function VOID
maps its argument to void: $\mathit{VOID}\ x = ()$. Some laws that hold for these functions are:

$$\begin{aligned}
\mathit{VOID} \circ f &= \mathit{VOID} \\
p? \circ x &= x \mid x \circ (p \circ x)?
\end{aligned}$$

In order to make recursion explicit, we use the operator $\mu \in (A \to A) \to A$ defined as:

$$\mu\ f = x \quad \text{where } x = f\ x$$

We assume that recursion (like $x = f\ x$) is well defined in the meta-language.

Let $\mathsf{F}$, $\mathsf{G}$ be functors and $\varphi_A \in A\mathsf{F} \to A\mathsf{G}$ for any type $A$. Such a $\varphi$ is called a polymorphic
function. A natural transformation is a family of functions $\varphi_A$ (omitting subscripts whenever
possible) such that:

$$\forall f : f \in A \to B : \varphi_B \circ f\mathsf{F} = f\mathsf{G} \circ \varphi_A \tag{11}$$

As a convenient shorthand for (11) we use $\varphi \in \mathsf{F} \stackrel{\cdot}{\to} \mathsf{G}$ to denote that $\varphi$ is a natural
transformation. The "Theorems For Free!" theorem of Wadler, deBruin and Reynolds [28, 9, 22]
states that any function definable in the polymorphic $\lambda$-calculus is a natural transformation. If
$\varphi$ is defined using $\mu$, one can only conclude that (11) holds for strict $f$.

Recursive types 

After all this stuff on functors we have finally armed ourselves sufficiently to abstract from the 
peculiarities of cons-lists, and formalize recursively defined data types in general. 

Let $\mathsf{F}$ be a monofunctor whose operation on functions is continuous, i.e., all monofunctors
defined using the above basic functors or any of the map-functors introduced in §5. Then
there exists a type $L$ and two strict functions $\mathit{in}_\mathsf{F} \in L\mathsf{F} \to L$ and $\mathit{out}_\mathsf{F} \in L \to L\mathsf{F}$ (omitting
subscripts whenever possible) which are each other's inverse and even $\mathit{id} = \mu(\mathit{in} \stackrel{\mathsf{F}}{\leftarrow} \mathit{out})$
[6, 23, 16, 24, 30, 12]. We let $\mu\mathsf{F}$ denote the pair $(L, \mathit{in})$ and say that it is "the least fixed
point of $\mathsf{F}$". Since $\mathit{in}$ and $\mathit{out}$ are each other's inverses we have that $L\mathsf{F}$ is isomorphic to $L$, and
indeed $L$ is, up to isomorphism, a fixed point of $\mathsf{F}$.

For example taking $X\mathsf{L} = 1 \mid A \| X$, we have that $(A*, \mathit{in}) = \mu\mathsf{L}$ defines the data type of cons-lists
over $A$ for any type $A$. If we put $\mathit{Nil} = \mathit{in} \circ \grave\imath \in 1 \to A*$ and $\mathit{Cons} = \mathit{in} \circ \acute\imath \in A \| A* \to A*$,
we get the more familiar $(A*, \mathit{Nil} \triangledown \mathit{Cons}) = \mu\mathsf{L}$. Another example of data types, binary
trees with leaves of type $A$, results from taking the least fixed point of $X\mathsf{T} = 1 \mid A \mid X \| X$.
Backward lists with elements of type $A$, or snoc lists as they are sometimes called, are the
least fixed point of $X\mathsf{L} = 1 \mid X \| A$. Natural numbers are specified as the least fixed point of
$X\mathsf{N} = 1 \mid X$.



4 Recursion Schemes 

Now that we have given a generic way of defining recursive data types, we can define cata-, 
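ana-, hylo- and paramorphisms over arbitrary data types. Let $(L, \mathit{in}) = \mu\mathsf{F}$, $\varphi \in A\mathsf{F} \to A$,
$\psi \in A \to A\mathsf{F}$, $\xi \in (A \| L)\mathsf{F} \to A$ then we define

$$(\!|\varphi|\!)_\mathsf{F} = \mu(\varphi \stackrel{\mathsf{F}}{\leftarrow} \mathit{out}) \tag{12}$$

$$[\!(\psi)\!]_\mathsf{F} = \mu(\mathit{in} \stackrel{\mathsf{F}}{\leftarrow} \psi) \tag{13}$$

$$[\![\varphi, \psi]\!]_\mathsf{F} = \mu(\varphi \stackrel{\mathsf{F}}{\leftarrow} \psi) \tag{14}$$

$$[\!\langle\xi\rangle\!]_\mathsf{F} = \mu(\lambda f.\ \xi \circ (\mathit{id} \triangle f)\mathsf{F} \circ \mathit{out}) \tag{15}$$

When no confusion can arise we omit the $\mathsf{F}$ subscripts.

Definition (12) agrees with the definition given in §2; where we wrote $(\!|e, \oplus|\!)$ we now write
$(\!|e^\bullet \triangledown (\oplus)|\!)$.

Definition (13) agrees with the informal one given earlier on; the notation $[\!(g, p)\!]$ of §2 now
becomes $[\!((\mathit{VOID} \mid g) \circ p?)\!]$.

Definition (14) agrees with the earlier one in the sense that taking $\varphi = c^\bullet \triangledown \oplus$ and $\psi =
(\mathit{VOID} \mid g) \circ p?$ makes $[\![(c, \oplus), (g, p)]\!]$ equal to $[\![\varphi, \psi]\!]$.

Definition (15) agrees with the description of paramorphisms as given in §2 in the sense that
$[\!\langle b, \oplus\rangle\!]$ equals $[\!\langle b^\bullet \triangledown (\oplus)\rangle\!]$ here.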
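
Definitions (12) to (15) can be written once against the action of a functor on functions and
then specialised. The sketch below instantiates them for the list functor $X\mathsf{L} = 1 \mid A \| X$
only, and rests on several assumptions that are not in the text: Python's own recursion
replaces $\mu$, a value of $X\mathsf{L}$ is encoded as `None` (the summand 1) or a pair `(a, x)`,
Python lists play the role of $A*$, and, because Python is strict, `ana` terminates only for
seeds that generate finite lists. All names are hypothetical.

```python
def fmap(f):
    """Action of the list functor L on a function: fL = id | id || f."""
    return lambda v: None if v is None else (v[0], f(v[1]))

def out(xs):                       # out : A* -> (A*)L
    return None if not xs else (xs[0], xs[1:])

def in_(v):                        # in : (A*)L -> A*
    return [] if v is None else [v[0]] + v[1]

def cata(phi):                     # (12): phi . (cata phi)L . out
    return lambda x: phi(fmap(cata(phi))(out(x)))

def ana(psi):                      # (13): in . (ana psi)L . psi
    return lambda a: in_(fmap(ana(psi))(psi(a)))

def hylo(phi, psi):                # (14): phi . (hylo phi psi)L . psi
    return lambda a: phi(fmap(hylo(phi, psi))(psi(a)))

def para(xi):                      # (15): xi . (id /\ para xi)L . out
    return lambda x: xi(fmap(lambda y: (y, para(xi)(y)))(out(x)))

# Instances: the argument algebras take None or a pair, mirroring c \/ (+).
length = cata(lambda v: 0 if v is None else 1 + v[1])
countdown = ana(lambda n: None if n == 0 else (n, n - 1))
product = lambda v: 1 if v is None else v[0] * v[1]
fac = hylo(product, lambda n: None if n == 0 else (n, n - 1))
tails = para(lambda v: [[]] if v is None else [[v[0]] + v[1][0]] + v[1][1])
```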



Program Calculation Laws 

Rather than letting the programmer use explicit recursion, we encourage the use of the above
fixed recursion patterns by providing a shopping list of laws that hold for these patterns. For
each $\Omega$-morphism, with $\Omega \in \{\text{cata}, \text{ana}, \text{para}\}$, we give an evaluation rule, which shows how
such a morphism can be evaluated, a Uniqueness Property, a canned induction proof for a given
function to be an $\Omega$-morphism, and a fusion law, which shows when the composition of some
function with an $\Omega$-morphism is again an $\Omega$-morphism. All these laws can be proved by mere
equational reasoning using the following properties of general recursive functions. The first one
is a 'free theorem' for the fixed point operator $\mu \in (A \to A) \to A$

$$f\ (\mu g) = \mu h \;\Leftarrow\; f \text{ strict} \;\wedge\; f \circ g = h \circ f \tag{16}$$

Theorem (16) appears under different names in many places² [20, 8, 2, 15, 7, 25, 13, 31]. In
this paper it will be called fixed point fusion.

The strictness condition in (16) can sometimes be relaxed by using

$$f\ (\mu g) = f'\ (\mu g') \;\Leftarrow\; f\ \bot = f'\ \bot \;\wedge\; f \circ g = h \circ f \;\wedge\; f' \circ g' = h \circ f' \tag{17}$$

Fixed point induction over the predicate $P(g, g') \equiv f\ g = f'\ g'$ will prove (17).

² Other references are welcome.

For hylomorphisms we prove that they can be split into an ana- and a catamorphism and show
how computation may be shifted within a hylomorphism. A number of derived laws show
the relation between certain cata- and anamorphisms. These laws are not valid in SET. The
hylomorphism laws follow from the following theorem:

$$\mu(f \stackrel{\mathsf{F}}{\leftarrow} g) \circ \mu(h \stackrel{\mathsf{F}}{\leftarrow} j) = \mu(f \stackrel{\mathsf{F}}{\leftarrow} j) \;\Leftarrow\; g \circ h = \mathit{id} \tag{18}$$

Catamorphisms 

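Evaluation rule The evaluation rule for catamorphisms follows from the fixed point property
$x = \mu f \Rightarrow x = f\ x$:

$$(\!|\varphi|\!) \circ \mathit{in} = \varphi \circ (\!|\varphi|\!)\mathsf{L} \tag{CataEval}$$

It states how to evaluate an application of $(\!|\varphi|\!)$ to an arbitrary element of $L$ (returned by the
constructor $\mathit{in}$); namely, apply $(\!|\varphi|\!)$ recursively to the argument of $\mathit{in}$ and then $\varphi$ to the result.

For cons lists $(A*, \mathit{Nil} \triangledown \mathit{Cons}) = \mu\mathsf{L}$ where $X\mathsf{L} = 1 \mid A \| X$ and $f\mathsf{L} = \mathit{id} \mid \mathit{id} \| f$ with
catamorphism $(\!|c \triangledown \oplus|\!)$ the evaluation rule reads:

$$(\!|c \triangledown \oplus|\!) \circ \mathit{Nil} = c \tag{19}$$

$$(\!|c \triangledown \oplus|\!) \circ \mathit{Cons} = \oplus \circ \mathit{id} \| (\!|c \triangledown \oplus|\!) \tag{20}$$

i.e. the variable free formulation of (1). Notice that the constructors, here $\mathit{Nil} \triangledown \mathit{Cons}$, are
used for parameter pattern matching.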

UP for catamorphisms The Uniqueness Property can be used to prove the equality of two
functions without using induction explicitly.

$$f = (\!|\varphi|\!) \;\equiv\; f \circ \bot = (\!|\varphi|\!) \circ \bot \;\wedge\; f \circ \mathit{in} = \varphi \circ f\mathsf{L} \tag{CataUP}$$

A typical induction proof for showing $f = (\!|\varphi|\!)$ takes the following steps. Check the induction
base: $f \circ \bot = (\!|\varphi|\!) \circ \bot$. Assuming the induction hypothesis $f\mathsf{L} = (\!|\varphi|\!)\mathsf{L}$ proceed by calculating:

$f \circ \mathit{in} = \ldots = \varphi \circ f\mathsf{L}$
=   induction hypothesis
$\varphi \circ (\!|\varphi|\!)\mathsf{L}$
=   evaluation rule (CataEval)
$(\!|\varphi|\!) \circ \mathit{in}$

to conclude that $f = (\!|\varphi|\!)$. The schematic set-up of such a proof is done once and for all, and
built into law (CataUP). We are thus saved from the standard ritual steps: the last two lines in
the above calculation, plus the declaration that 'by induction' the proof is complete.

The $\Rightarrow$ part of the proof for (CataUP) follows directly from the evaluation rule for
catamorphisms. For the $\Leftarrow$ part we use the fixed point fusion theorem (17) with $f := (f\circ)$,
$g := g' := \mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}$ and $f' := ((\!|\varphi|\!)\circ)$. This gives us
$f \circ \mu(\mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}) = (\!|\varphi|\!) \circ \mu(\mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out})$
and since $\mu(\mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}) = \mathit{id}$ we are done.

Fusion law for catamorphisms The Fusion Law for catamorphisms can be used to transform
the composition of a function with a catamorphism into a single catamorphism, so that
intermediate values can be avoided. Sometimes the law is used the other way around, i.e. to
split a function, in order to allow for subsequent optimizations.

$$f \circ (\!|\varphi|\!) = (\!|\psi|\!) \;\Leftarrow\; f \circ \bot = (\!|\psi|\!) \circ \bot \;\wedge\; f \circ \varphi = \psi \circ f\mathsf{L} \tag{CataFusion}$$

The fusion law can be proved using fixed point fusion theorem (17) with $f := (f\circ)$,
$g := \varphi \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}$, $g' := \mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}$ and $f' := ((\!|\psi|\!)\circ)$.

A slight variation of the fusion law is to replace the condition $f \circ \bot = (\!|\psi|\!) \circ \bot$ by $f \circ \bot = \bot$,
i.e. $f$ is strict.

$$f \circ (\!|\varphi|\!) = (\!|\psi|\!) \;\Leftarrow\; f \text{ strict} \;\wedge\; f \circ \varphi = \psi \circ f\mathsf{L} \tag{CataFusion'}$$

This law follows from (16). In actual calculations this latter law is more valuable as its
applicability conditions are on the whole easier to check.

Injective functions are catamorphisms Let $f \in A \to B$ be a strict function with left-inverse
$g$, then for any $\varphi \in A\mathsf{F} \to A$ we have

$$f \circ (\!|\varphi|\!) = (\!|f \circ \varphi \circ g\mathsf{F}|\!) \;\Leftarrow\; f \text{ strict} \;\wedge\; g \circ f = \mathit{id} \tag{21}$$

Taking $\varphi = \mathit{in}$ we immediately get that any strict injective function can be written as a
catamorphism.

$$f = (\!|f \circ \mathit{in} \circ g\mathsf{F}|\!)_\mathsf{F} \;\Leftarrow\; f \text{ strict} \;\wedge\; g \circ f = \mathit{id} \tag{22}$$

Using this latter result we can write $\mathit{out}$ in terms of $\mathit{in}$ since
$\mathit{out} = (\!|\mathit{out} \circ \mathit{in} \circ \mathit{in}\mathsf{L}|\!) = (\!|\mathit{in}\mathsf{L}|\!)$.

Catamorphisms preserve strictness The given laws for catamorphisms all demonstrate the
importance of strictness, or generally of the behaviour of a function with respect to $\bot$. The
following "poor man's strictness analyser" for that reason can often be put to good use.

$$\mu F \circ \bot = \bot \;\Leftarrow\; \forall f :: F\ f \circ \bot = \bot \tag{23}$$

The proof of (23) is by fixed point induction over $P(F) \equiv F \circ \bot = \bot$.

Specifically for catamorphisms we have

$$(\!|\varphi|\!)_\mathsf{L} \circ \bot = \bot \;\equiv\; \varphi \circ \bot = \bot$$

if $\mathsf{L}$ is strictness preserving. The $\Leftarrow$ part of the proof directly follows from (23) and the definition
of catamorphisms. The other way around is shown as follows

$\bot$
=   premise
$(\!|\varphi|\!) \circ \bot$
=   $\mathit{in} \circ \bot = \bot$
$(\!|\varphi|\!) \circ \mathit{in} \circ \bot$
=   evaluation rule
$\varphi \circ (\!|\varphi|\!)\mathsf{L} \circ \bot$
=   $\mathsf{L}$ preserves strictness
$\varphi \circ \bot$

Examples 

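Unfold-Fold Many transformations usually accomplished by the unfold-simplify-fold technique
can be restated using fusion. Let $(\mathit{Num}*, \mathit{Nil} \triangledown \mathit{Cons}) = \mu\mathsf{L}$, where $X\mathsf{L} = 1 \mid \mathit{Num} \| X$
and $f\mathsf{L} = \mathit{id} \mid \mathit{id} \| f$, be the type of lists of natural numbers. Using fusion we derive an efficient
version of $\mathit{sum} \circ \mathit{squares}$ where $\mathit{sum} = (\!|0^\bullet \triangledown +|\!)$ and $\mathit{squares} = (\!|\mathit{Nil} \triangledown (\mathit{Cons} \circ \mathit{SQ} \| \mathit{id})|\!)$.
Since $\mathit{sum}$ is strict we just start calculating, aiming at the discovery of a $\psi$ that satisfies the
condition of (CataFusion').

$$\begin{aligned}
& \mathit{sum} \circ \mathit{Nil} \triangledown (\mathit{Cons} \circ \mathit{SQ} \| \mathit{id}) \\
={}& (\mathit{sum} \circ \mathit{Nil}) \triangledown (\mathit{sum} \circ \mathit{Cons} \circ \mathit{SQ} \| \mathit{id}) \\
={}& 0^\bullet \triangledown ((+) \circ \mathit{id} \| \mathit{sum} \circ \mathit{SQ} \| \mathit{id}) \\
={}& 0^\bullet \triangledown ((+) \circ \mathit{SQ} \| \mathit{id} \circ \mathit{id} \| \mathit{sum}) \\
={}& 0^\bullet \triangledown ((+) \circ \mathit{SQ} \| \mathit{id}) \circ \mathit{sum}\mathsf{L}
\end{aligned}$$

and conclude that $\mathit{sum} \circ \mathit{squares} = (\!|0^\bullet \triangledown ((+) \circ \mathit{SQ} \| \mathit{id})|\!)$.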
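
The outcome of this derivation can be compared against the unfused composition on a few
inputs. A small sketch, assuming Python lists of integers for $\mathit{Num}*$; `cata`, `sum_`,
`squares` and `sum_squares` are hypothetical names.

```python
def cata(c, op):
    """List catamorphism (|c \\/ op|) on Python lists."""
    def h(xs):
        acc = c
        for a in reversed(xs):
            acc = op(a, acc)
        return acc
    return h

sq = lambda a: a * a

sum_ = cata(0, lambda a, r: a + r)               # sum
squares = cata([], lambda a, r: [sq(a)] + r)     # builds an intermediate list
sum_squares = cata(0, lambda a, r: sq(a) + r)    # fused: no intermediate list
```

The fused version traverses its argument once and never allocates the list of squares.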



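A slightly more complicated problem is to derive a one-pass solution for

$$\mathit{average} = \mathit{DIV} \circ \mathit{sum} \triangle \mathit{length}$$

Using the tupling lemma of Fokkinga [10]

$$(\!|\varphi|\!) \triangle (\!|\psi|\!) = (\!|(\varphi \circ \grave\pi\mathsf{L}) \triangle (\psi \circ \acute\pi\mathsf{L})|\!)$$

a simple calculation shows that
$\mathit{average} = \mathit{DIV} \circ (\!|(0^\bullet \triangledown (+) \circ \mathit{id} \| \grave\pi) \triangle (0^\bullet \triangledown (+1) \circ \acute\pi \circ \acute\pi)|\!)$.
Here the second projection is applied twice because the tupled algebra receives a list element
paired with the pair of partial results.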
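
The tupled catamorphism computes the sum and the length in a single traversal. The following
sketch assumes Python lists of integers and reads DIV as integer division; `cata`, `sum_len`
and `average` are hypothetical names, and the empty list is left undefined (it raises
`ZeroDivisionError`).

```python
def cata(c, op):
    """List catamorphism (|c \\/ op|) on Python lists."""
    def h(xs):
        acc = c
        for a in reversed(xs):
            acc = op(a, acc)
        return acc
    return h

# One algebra producing the pair (sum, length): the first component adds the
# element to the partial sum, the second ignores the element and counts.
sum_len = cata((0, 0), lambda a, sn: (a + sn[0], 1 + sn[1]))

def average(xs):
    s, n = sum_len(xs)
    return s // n          # DIV, taken here to be integer division
```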

Accumulating Arguments An important item in the functional programmer's bag of tricks 
is the technique of accumulating arguments where an extra parameter is added to a function to 
accumulate the result of the computation. Though stated here in terms of catamorphisms over 
cons-lists, the same technique is applicable to other data types and other kind of morphisms as 
well. 
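
As a concrete illustration of an accumulating argument, the sketch below contrasts a naive
list reversal with a catamorphism of higher type that returns a function of the accumulator.
It is only an illustration: `cata`, `naive_reverse`, `acc_reverse` and `reverse` are
hypothetical names, and since Python's list concatenation copies its operands the asymptotic
gain of the technique is shown in structure only, not in running time.

```python
def cata(c, op):
    """List catamorphism (|c \\/ op|) on Python lists."""
    def h(xs):
        acc = c
        for a in reversed(xs):
            acc = op(a, acc)
        return acc
    return h

# Naive version: every step appends the head at the end of the reversed tail.
naive_reverse = cata([], lambda a, r: r + [a])

# Accumulating version: the result of the catamorphism is a function that
# expects the list reversed so far; each step pushes its element onto it.
acc_reverse = cata(lambda bs: bs, lambda a, f: lambda bs: f([a] + bs))

def reverse(xs):
    return acc_reverse(xs)([])     # start with an empty accumulator
```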



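$$\begin{aligned}
& (\!|c^\bullet \triangledown \oplus|\!)_\mathsf{L}\ l = (\!|(c\otimes)^\bullet \triangledown \ominus|\!)_\mathsf{L}\ l\ \nu_\otimes \quad \text{where } (a \ominus f)\ b = f\ (a \odot b) \\
& \quad\Leftarrow\; a \otimes \nu_\otimes = a \;\wedge\; \bot \otimes a = \bot \;\wedge\; (a \oplus b) \otimes c = b \otimes (a \odot c)
\end{aligned} \tag{24}$$

Theorem (24) follows from the fusion law by taking $\mathit{Accu} \circ (\!|c^\bullet \triangledown \oplus|\!) = (\!|(c\otimes)^\bullet \triangledown \ominus|\!)$ with
$\mathit{Accu}\ a\ b = a \otimes b$.

Given the naive quadratic definition of $\mathit{reverse} \in A* \to A*$ as a catamorphism $(\!|\mathit{Nil}^\bullet \triangledown \oplus|\!)$
where $a \oplus as = as +\!\!+ (\mathit{Cons}\ (a, \mathit{Nil}))$, we can derive a linear time algorithm by instantiating
(24) with $\otimes := +\!\!+$ and $\odot := \mathit{Cons}$ to get a function which accumulates the list being reversed
as an additional argument: $(\!|\mathit{id}^\bullet \triangledown \ominus|\!)$ where $(a \ominus as)\ bs = as\ (\mathit{Cons}\ (a, bs))$. Here $+\!\!+$
is the function that appends two lists, defined as $as +\!\!+ bs = (\!|\mathit{id}^\bullet \triangledown \oplus|\!)\ as\ bs$ where
$(a \oplus f)\ bs = \mathit{Cons}\ (a, f\ bs)$.

In general catamorphisms of higher type $L \to (I \to S)$ form an interesting class by themselves
as they correspond to attribute grammars [11].



Anamorphisms

Evaluation rule The evaluation rule for anamorphisms is given by:

$$\mathit{out} \circ [\!(\psi)\!] = [\!(\psi)\!]\mathsf{L} \circ \psi \tag{AnaEval}$$

It says what the result of an arbitrary application of $[\!(\psi)\!]$ looks like: the constituents produced
by applying $\mathit{out}$ can equivalently be obtained by first applying $\psi$ and then applying $[\!(\psi)\!]\mathsf{L}$
recursively to the result.

Anamorphisms are real old fusspots to explain. To instantiate (AnaEval) for cons lists we define:

$$\begin{aligned}
\mathit{hd} &= \bot \triangledown \grave\pi \circ \mathit{out} \\
\mathit{tl} &= \bot \triangledown \acute\pi \circ \mathit{out} \\
\mathit{is\_nil} &= \mathit{true}^\bullet \triangledown \mathit{false}^\bullet \circ \mathit{out}
\end{aligned}$$

Assuming that $f = [\!((\mathit{VOID} \mid (h \triangle t)) \circ p?)\!]$ we find after a little calculation that:

$$\begin{aligned}
\mathit{is\_nil} \circ f &= p \\
\mathit{hd} \circ f &= h \;\Leftarrow\; \neg p \\
\mathit{tl} \circ f &= f \circ t \;\Leftarrow\; \neg p
\end{aligned}$$

which corresponds to the characterization of unfold given by Bird and Wadler [5] on page
173.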

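UP for anamorphisms The UP for anamorphisms is slightly simpler than the one for
catamorphisms, since the base case does not have to be checked.

$$f = [\!(\psi)\!] \;\equiv\; \mathit{out} \circ f = f\mathsf{L} \circ \psi \tag{AnaUP}$$

To prove it we can use fixed point fusion theorem (16) with $f := (\circ f)$, $g := \mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}$ and
$h := \mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \psi$. This gives us $\mu(\mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}) \circ f = \mu(\mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \psi)$ and again since
$\mu(\mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \mathit{out}) = \mathit{id}$ we are done.

Fusion law for anamorphisms The strictness requirement that was needed for catamorphisms
can be dropped in the anamorphism case. The dual condition of $f \circ \bot = \bot$ for
strictness is $\bot \circ f = \bot$ which is vacuously true.

$$[\!(\varphi)\!] \circ f = [\!(\psi)\!] \;\Leftarrow\; \varphi \circ f = f\mathsf{L} \circ \psi \tag{AnaFusion}$$

This law can be proved by fixed point fusion theorem (16) with $f := (\circ f)$, $g := \mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \varphi$ and
$h := \mathit{in} \stackrel{\mathsf{L}}{\leftarrow} \psi$.

Any surjective function is an anamorphism The results (21) and (22) can be dualized
for anamorphisms. Let $f \in B \to A$ be a surjective function with right-inverse $g$, then for any
$\psi \in A \to A\mathsf{L}$ we have

$$[\!(\psi)\!] \circ f = [\!(g\mathsf{L} \circ \psi \circ f)\!] \;\Leftarrow\; f \circ g = \mathit{id} \tag{25}$$

since $\psi \circ f = f\mathsf{L} \circ (g\mathsf{L} \circ \psi \circ f)$. The special case where $\psi$ equals $\mathit{out}$ yields that any surjective
function can be written as an anamorphism.

$$f = [\!(g\mathsf{L} \circ \mathit{out} \circ f)\!]_\mathsf{L} \;\Leftarrow\; f \circ g = \mathit{id} \tag{26}$$

As $\mathit{in}$ has right-inverse $\mathit{out}$, we can express $\mathit{in}$ using $\mathit{out}$ by
$\mathit{in} = [\!(\mathit{out}\mathsf{L} \circ \mathit{out} \circ \mathit{in})\!] = [\!(\mathit{out}\mathsf{L})\!]$.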
Examples 

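Reformulated in the lens notation, the function $\mathit{iterate}\ f$ becomes:

$$\mathit{iterate}\ f = [\!(\acute\imath \circ \mathit{id} \triangle f)\!]$$

We have $[\!(\acute\imath \circ \mathit{id} \triangle f)\!] = [\!((\mathit{VOID} \mid \mathit{id} \triangle f) \circ \mathit{false}^\bullet?)\!]$ ($= [\!(\mathit{id} \triangle f, \mathit{false}^\bullet)\!]$ in the notation of
section 2).

Another useful list-processing function is $\mathit{takewhile}\ p$ which selects the longest initial segment
of a list all of whose elements satisfy $p$. In conventional notation:

$$\begin{aligned}
\mathit{takewhile}\ p\ \mathit{Nil} &= \mathit{Nil} \\
\mathit{takewhile}\ p\ (\mathit{Cons}\ (a, as)) &= \begin{cases} \mathit{Nil}, & \neg p\ a \\ \mathit{Cons}\ (a, \mathit{takewhile}\ p\ as), & \text{otherwise} \end{cases}
\end{aligned}$$

The anamorphism definition may look a little daunting at first:

$$\mathit{takewhile}\ p = [\!(\grave\imath \triangledown ((\mathit{VOID} \mid \mathit{id}) \circ (\neg p \circ \grave\pi)?) \circ \mathit{out})\!]$$

The function $f\ \mathit{while}\ p$ contains all repeated applications of $f$ as long as predicate $p$ holds:

$$f\ \mathit{while}\ p = \mathit{takewhile}\ p \circ \mathit{iterate}\ f$$

Using the fusion law (after a rather long calculation) we can show that
$f\ \mathit{while}\ p = [\!((\mathit{VOID} \mid (\mathit{id} \triangle f)) \circ \neg p?)\!]$.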
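
The fused form of $f\ \mathit{while}\ p$ can be compared with the composition it was derived from.
This is a sketch under stated assumptions: generators supply the laziness needed for the
infinite list $\mathit{iterate}\ f$, a step function returns `None` for the left summand (stop) or a
pair `(a, seed)` for the right summand, `takewhile` is written directly as a generator rather
than through `out`, and all names are hypothetical.

```python
def ana(psi):
    """Lazy list anamorphism: psi returns None to stop, or (a, next_seed)."""
    def h(b):
        step = psi(b)
        while step is not None:
            a, b = step
            yield a
            step = psi(b)
    return h

def iterate(f):
    # never stops: [a, f a, f (f a), ...]
    return ana(lambda a: (a, f(a)))

def takewhile(p):
    def h(xs):
        for a in xs:
            if not p(a):
                return
            yield a
    return h

def while_(f, p):
    # fused form: emit a and continue from f a as long as p a holds
    return ana(lambda a: (a, f(a)) if p(a) else None)

double = lambda n: 2 * n
small = lambda n: n < 100
composed = list(takewhile(small)(iterate(double)(1)))
fused = list(while_(double, small)(1))
```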
Hylomorphisms



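Splitting Hylomorphisms In order to prove that a hylomorphism can be split into an anamorphism
followed by a catamorphism

$$[\![\varphi, \psi]\!] = (\!|\varphi|\!) \circ [\!(\psi)\!] \tag{HyloSplit}$$

we can use the total fusion theorem (18).

Shifting law Hylomorphisms are nice since their decomposability into a cata- and an anamorphism
allows us to use the respective fusion laws to shift computation in or out of a hylomorphism.
The following shifting law shows how computations can be shifted within a hylomorphism.

$$[\![\varphi \circ \xi, \psi]\!]_\mathsf{L} = [\![\varphi, \xi \circ \psi]\!]_\mathsf{M} \;\Leftarrow\; \xi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M} \tag{HyloShift}$$

The proof of this theorem is straightforward.

$[\![\varphi \circ \xi, \psi]\!]_\mathsf{L}$
=   definition hylo
$\mu(\lambda f.\ \varphi \circ \xi \circ f\mathsf{L} \circ \psi)$
=   $\xi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M}$
$\mu(\lambda f.\ \varphi \circ f\mathsf{M} \circ \xi \circ \psi)$
=   definition hylo
$[\![\varphi, \xi \circ \psi]\!]_\mathsf{M}$

An admittedly humbug example of (HyloShift) shows how left linear recursive functions can be
transformed into right linear recursive functions. Let $f\mathsf{L} = \mathit{id} \mid f \| \mathit{id}$ and $f\mathsf{R} = \mathit{id} \mid \mathit{id} \| f$ define
the functors which express left respectively right linear recursion, then if $x \oplus y = y \oplus x$ we
have

$[\![c \triangledown \oplus, (f \mid (h \triangle t)) \circ p?]\!]_\mathsf{L}$
=   $\oplus$ is commutative
$[\![c \triangledown \oplus \circ \mathit{SWAP}, (f \mid (h \triangle t)) \circ p?]\!]_\mathsf{L}$
=   $\mathit{SWAP} \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{R}$
$[\![c \triangledown \oplus, \mathit{SWAP} \circ (f \mid (h \triangle t)) \circ p?]\!]_\mathsf{R}$
=
$[\![c \triangledown \oplus, (f \mid (t \triangle h)) \circ p?]\!]_\mathsf{R}$

where $\mathit{SWAP} = \mathit{id} \mid (\acute\pi \triangle \grave\pi)$.

Relating cata- and anamorphisms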



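From the splitting and shifting laws (HyloSplit), (HyloShift) and the fact that $(\!|\varphi|\!) = [\![\varphi, \mathit{out}]\!]$
and $[\!(\psi)\!] = [\![\mathit{in}, \psi]\!]$ we can derive a number of interesting laws which relate cata- and
anamorphisms with each other.

$$(\!|\mathit{in}_\mathsf{M} \circ \varphi|\!)_\mathsf{L} = [\!(\varphi \circ \mathit{out}_\mathsf{L})\!]_\mathsf{M} \;\Leftarrow\; \varphi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M} \tag{27}$$

Using this law we can easily show that

$$(\!|\varphi \circ \psi|\!)_\mathsf{L} = (\!|\varphi|\!)_\mathsf{M} \circ [\!(\psi \circ \mathit{out}_\mathsf{L})\!]_\mathsf{M} \;\Leftarrow\; \psi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M} \tag{28}$$

$$\phantom{(\!|\varphi \circ \psi|\!)_\mathsf{L}} = (\!|\varphi|\!)_\mathsf{M} \circ (\!|\mathit{in}_\mathsf{M} \circ \psi|\!)_\mathsf{L} \;\Leftarrow\; \psi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M} \tag{29}$$

$$[\!(\varphi \circ \psi)\!]_\mathsf{M} = (\!|\mathit{in}_\mathsf{M} \circ \varphi|\!)_\mathsf{L} \circ [\!(\psi)\!]_\mathsf{L} \;\Leftarrow\; \varphi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M} \tag{30}$$

$$\phantom{[\!(\varphi \circ \psi)\!]_\mathsf{M}} = [\!(\varphi \circ \mathit{out}_\mathsf{L})\!]_\mathsf{M} \circ [\!(\psi)\!]_\mathsf{L} \;\Leftarrow\; \varphi \in \mathsf{L} \stackrel{\cdot}{\to} \mathsf{M} \tag{31}$$

This set of laws will be used in §5.

From the total fusion theorem (18) we can derive:

$$[\!(\psi)\!]_\mathsf{L} \circ (\!|\varphi|\!)_\mathsf{L} = \mathit{id} \;\Leftarrow\; \psi \circ \varphi = \mathit{id} \tag{32}$$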

Example: Reflecting binary trees 

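The type of binary trees with leaves of type $A$ is given by $(\mathrm{tree}\ A, \mathrm{in}) = \mu L$ where
$X_L = 1 \mid A \mid X \parallel X$ and $f_L = \mathrm{id} \mid \mathrm{id} \mid f \parallel f$. Reflecting a binary tree can be defined by:
$\mathit{reflect} = (\!|\mathrm{in} \circ \mathrm{SWAP}|\!)$ where $\mathrm{SWAP} = \mathrm{id} \mid \mathrm{id} \mid (\acute{\pi} \vartriangle \grave{\pi})$. A simple calculation proves that
$\mathit{reflect} \circ \mathit{reflect} = \mathrm{id}$.

$$\begin{aligned}
   & \mathit{reflect} \circ \mathit{reflect} \\
=\ & [\!(\mathrm{SWAP} \circ \mathrm{out})\!] \circ (\!|\mathrm{in} \circ \mathrm{SWAP}|\!) && \mathrm{SWAP} \circ f_L = f_L \circ \mathrm{SWAP} \\
=\ & \mathrm{id} && \mathrm{SWAP} \circ \mathrm{out} \circ \mathrm{in} \circ \mathrm{SWAP} = \mathrm{id}
\end{aligned}$$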
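The reflect example can be tried out in Haskell. The sketch below is only an approximation of the paper's setting: `Fix`, `cata`, `ana`, `TreeF`, `swapT` and `leaves` are names chosen here, and Haskell's `Functor` class stands in for the functor $L$. It defines reflect both as a catamorphism and, following law (27), as an anamorphism.

```haskell
-- Base functor of binary trees with leaves of type a: 1 | A | X||X.
data TreeF a x = Empty | Leaf a | Node x x

instance Functor (TreeF a) where
  fmap _ Empty      = Empty
  fmap _ (Leaf a)   = Leaf a
  fmap g (Node l r) = Node (g l) (g r)

-- Fixed point of a functor, with In and out as mutual inverses.
newtype Fix f = In { out :: f (Fix f) }

cata :: Functor f => (f b -> b) -> Fix f -> b
cata phi = phi . fmap (cata phi) . out

ana :: Functor f => (b -> f b) -> b -> Fix f
ana psi = In . fmap (ana psi) . psi

-- SWAP: exchange the two subtrees of a node, leave other cases alone.
swapT :: TreeF a x -> TreeF a x
swapT (Node l r) = Node r l
swapT t          = t

-- reflect as a catamorphism and, by law (27), as an anamorphism.
reflectCata, reflectAna :: Fix (TreeF a) -> Fix (TreeF a)
reflectCata = cata (In . swapT)
reflectAna  = ana (swapT . out)

-- Leaves from left to right, used to observe the results.
leaves :: Fix (TreeF a) -> [a]
leaves = cata alg
  where
    alg Empty      = []
    alg (Leaf a)   = [a]
    alg (Node l r) = l ++ r

-- A small sample tree with leaves 1, 2, 3.
sample :: Fix (TreeF Int)
sample = In (Node (In (Leaf 1)) (In (Node (In (Leaf 2)) (In (Leaf 3)))))
```

Reflecting `sample` reverses the order of its leaves, and reflecting twice restores the original order.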

Paramorphisms 



The evaluation rule for paramorphisms is

$$[\!\langle\varphi\rangle\!] \circ \mathrm{in} = \varphi \circ (\mathrm{id} \vartriangle [\!\langle\varphi\rangle\!])_L \qquad \text{(ParaEval)}$$



The UP for paramorphisms is similar to that of catamorphisms: 

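$$f = [\!\langle\varphi\rangle\!] \;\equiv\; f \circ \bot = [\!\langle\varphi\rangle\!] \circ \bot \;\wedge\; f \circ \mathrm{in} = \varphi \circ (\mathrm{id} \vartriangle f)_L \qquad \text{(ParaUP)}$$

The fusion law for paramorphisms reads

$$f \circ [\!\langle\varphi\rangle\!] = [\!\langle\psi\rangle\!] \;\Leftarrow\; f \text{ strict} \;\wedge\; f \circ \varphi = \psi \circ (\mathrm{id} \parallel f)_L \qquad \text{(ParaFusion)}$$

Any function $f$ (of the right type of course!) is a paramorphism.

$$f = [\!\langle f \circ \mathrm{in} \circ \grave{\pi}_L \rangle\!]$$

The usefulness of this theorem can be read from its proof.

$$\begin{aligned}
   & [\!\langle f \circ \mathrm{in} \circ \grave{\pi}_L \rangle\!] \\
=\ & \mu(\lambda g.\ f \circ \mathrm{in} \circ \grave{\pi}_L \circ (\mathrm{id} \vartriangle g)_L \circ \mathrm{out}) && \text{definition (15)} \\
=\ & \mu(\lambda g.\ f \circ \mathrm{in} \circ \mathrm{out}) && \text{functor calculus} \\
=\ & f
\end{aligned}$$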
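A minimal Haskell sketch of paramorphisms, assuming a fixed-point type `Fix` and a base functor `NatF` for the naturals (both names, as well as `para`, `toNat`, `fromNat`, `factorial` and `asPara`, are introduced here for illustration). `para` follows the shape of definition (15): the algebra receives each substructure paired with the result computed for it. `asPara` mirrors the theorem that any function is a paramorphism.

```haskell
newtype Fix f = In { out :: f (Fix f) }

-- para phi = phi . fmap (id /\ para phi) . out
para :: Functor f => (f (Fix f, b) -> b) -> Fix f -> b
para phi = phi . fmap (\x -> (x, para phi x)) . out

-- Base functor of the natural numbers: 1 | X.
data NatF x = Zero | Succ x

instance Functor NatF where
  fmap _ Zero     = Zero
  fmap g (Succ x) = Succ (g x)

type Nat = Fix NatF

toNat :: Int -> Nat
toNat n
  | n <= 0    = In Zero
  | otherwise = In (Succ (toNat (n - 1)))

-- A paramorphism that ignores the predecessor is just a catamorphism.
fromNat :: Nat -> Int
fromNat = para alg
  where
    alg Zero          = 0
    alg (Succ (_, r)) = r + 1

-- factorial uses both the predecessor n and the recursive result r.
factorial :: Nat -> Integer
factorial = para alg
  where
    alg Zero          = 1
    alg (Succ (n, r)) = (toInteger (fromNat n) + 1) * r

-- Any function f equals para (f . In . fmap fst): the algebra discards
-- the recursive results, rebuilds the argument, and applies f to it.
asPara :: Functor f => (Fix f -> b) -> Fix f -> b
asPara f = para (f . In . fmap fst)
```

Thanks to laziness, `asPara f` never forces the discarded recursive results, which matches the functor-calculus step in the proof.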

Example: composing paramorphisms from ana- and catamorphisms 

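A nice result is that any paramorphism can be written as the composition of a cata- and an
anamorphism. Let $(L, \mathrm{in}) = \mu L$ be given, then define

$$\begin{aligned}
X_M &= (L \parallel X)_L \\
h_M &= (\mathrm{id} \parallel h)_L \\
(M, \mathrm{IN}) &= \mu M
\end{aligned}$$

For natural numbers we get $X_M = (\mathrm{Num} \parallel X)_L = 1 \mid \mathrm{Num} \parallel X$, i.e. $(\mathrm{Num}^{*}, \mathrm{in}) = \mu M$, which
is the type of lists of natural numbers.

Now define $\mathit{preds} \in L \to M$ as follows (with $\Delta = \mathrm{id} \vartriangle \mathrm{id}$):

$$\mathit{preds} = [\!(\Delta_L \circ \mathrm{out}_L)\!]_M$$

For the naturals we get $\mathit{preds} = [\!(\mathrm{id} \mid \Delta \circ \mathrm{out})\!]$, that is, given a natural number $N = n$, the
expression $\mathit{preds}\ N$ yields the list $[n-1, \ldots, 0]$.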

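Using preds we start calculating:

$$\begin{aligned}
   & (\!|\varphi|\!)_M \circ \mathit{preds} \\
=\ & (\!|\varphi|\!)_M \circ [\!(\Delta_L \circ \mathrm{out}_L)\!]_M \\
=\ & \mu(\lambda f.\ \varphi \circ f_M \circ \Delta_L \circ \mathrm{out}_L) \\
=\ & \mu(\lambda f.\ \varphi \circ (\mathrm{id} \parallel f)_L \circ (\mathrm{id} \vartriangle \mathrm{id})_L \circ \mathrm{out}_L) \\
=\ & \mu(\lambda f.\ \varphi \circ (\mathrm{id} \vartriangle f)_L \circ \mathrm{out}_L) \\
=\ & [\!\langle\varphi\rangle\!]_L
\end{aligned}$$

Thus $[\!\langle\varphi\rangle\!]_L = (\!|\varphi|\!)_M \circ \mathit{preds}$. Since $(\!|\mathrm{IN}|\!)_M = \mathrm{id}$ we immediately get $\mathit{preds} = [\!\langle\mathrm{IN}\rangle\!]_L$.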
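The construction can be sketched generically in Haskell. The names `Fix`, `cata`, `ana`, `M`, `preds`, `paraViaPreds`, `NatF`, `predList` and `factorial` are assumptions of this example. The newtype `M l` encodes the functor $M$ built from a base functor `l`, and `preds` duplicates every substructure before recursing.

```haskell
newtype Fix f = In { out :: f (Fix f) }

cata :: Functor f => (f b -> b) -> Fix f -> b
cata phi = phi . fmap (cata phi) . out

ana :: Functor f => (b -> f b) -> b -> Fix f
ana psi = In . fmap (ana psi) . psi

-- The functor M built from a base functor l: X_M = (L || X)_L.
newtype M l x = M { unM :: l (Fix l, x) }

instance Functor l => Functor (M l) where
  fmap h (M t) = M (fmap (\(a, x) -> (a, h x)) t)

-- preds = ana (Delta_L . out), where Delta duplicates its argument.
preds :: Functor l => Fix l -> Fix (M l)
preds = ana (M . fmap (\x -> (x, x)) . out)

-- A paramorphism written as a catamorphism over M after preds.
paraViaPreds :: Functor l => (l (Fix l, b) -> b) -> Fix l -> b
paraViaPreds phi = cata (phi . unM) . preds

-- Instance for the naturals: base functor 1 | X.
data NatF x = Zero | Succ x

instance Functor NatF where
  fmap _ Zero     = Zero
  fmap g (Succ x) = Succ (g x)

toNat :: Int -> Fix NatF
toNat n
  | n <= 0    = In Zero
  | otherwise = In (Succ (toNat (n - 1)))

fromNat :: Fix NatF -> Int
fromNat = cata alg
  where
    alg Zero     = 0
    alg (Succ r) = r + 1

-- The list of predecessors built by preds, read back as a Haskell list.
predList :: Fix NatF -> [Int]
predList = cata alg . preds
  where
    alg (M Zero)          = []
    alg (M (Succ (n, r))) = fromNat n : r

-- factorial as a catamorphism after preds.
factorial :: Fix NatF -> Integer
factorial = paraViaPreds alg
  where
    alg Zero          = 1
    alg (Succ (n, r)) = (toInteger (fromNat n) + 1) * r
```

For the natural number 3, `predList` gives the list 2, 1, 0, in line with the description of preds above.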



5 Parametrized Types 

In §2 we have defined, for $f \in A \to B$, the map function $f^{*} \in A^{*} \to B^{*}$. Two laws for $*$ are
$\mathrm{id}^{*} = \mathrm{id}$ and $(f \circ g)^{*} = f^{*} \circ g^{*}$. These two laws precisely state that $*$ is a functor. Another
characteristic property of map is that it leaves the 'shape' of its argument unchanged. It turns
out that any parametrized data type comes equipped with such a map functor. A parametrized
type is a type defined as the least fixed point of a sectioned bifunctor. Contrary to Malcolm's
approach [17], map can be defined both as a catamorphism and as an anamorphism.



Maps 

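Let $\dagger$ be a bi-functor, then we define the functor $*$ on objects $A$ as the parametrized type
$A^{*}$ where $(A^{*}, \mathrm{in}) = \mu(A\dagger)$, and on functions $f \in A \to B$ as:

$$f^{*} = (\!|\mathrm{in} \circ (f\dagger)|\!)_{(A\dagger)} \qquad (33)$$

Since $(f\dagger) \in (A\dagger) \mathrel{\dot\to} (B\dagger)$, from (27) we immediately get an alternative version of $f^{*}$ as an
anamorphism:

$$f^{*} = [\!((f\dagger) \circ \mathrm{out})\!]_{(B\dagger)}$$

Functoriality of $f^{*}$ is calculated as follows:

$$\begin{aligned}
   & f^{*} \circ g^{*} \\
=\ & (\!|\mathrm{in} \circ (f\dagger)|\!) \circ (\!|\mathrm{in} \circ (g\dagger)|\!) && \text{definition } * \\
=\ & (\!|\mathrm{in} \circ (f\dagger) \circ (g\dagger)|\!) && (29) \\
=\ & (\!|\mathrm{in} \circ ((f \circ g)\dagger)|\!) && (9) \\
=\ & (f \circ g)^{*} && \text{definition } *
\end{aligned}$$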

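Maps are shape preserving. Define $\mathrm{SHAPE} = \mathrm{VOID}^{*}$, then $\mathrm{SHAPE} \circ f^{*} = (\mathrm{VOID} \circ f)^{*} =
\mathrm{SHAPE}$.

For cons-lists $(A^{*}, \mathrm{Nil} \triangledown \mathrm{Cons}) = \mu(A\dagger)$ with $A \dagger X = 1 \mid A \parallel X$ and $f \dagger g = \mathrm{id} \mid f \parallel g$ we get
$f^{*} = [\!(f \dagger \mathrm{id} \circ \mathrm{out})\!]$. From the UP for catas we find that this conforms to the usual definition
of map.

$$\begin{aligned}
f^{*} \circ \mathrm{Nil} &= \mathrm{Nil} \\
f^{*} \circ \mathrm{Cons} &= \mathrm{Cons} \circ f \parallel f^{*}
\end{aligned}$$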
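For cons-lists, both versions of map can be written down directly. The sketch below is illustrative only: `Fix`, `cata`, `ana`, `ListF`, `bimapL`, `mapCata`, `mapAna`, `fromList` and `toList` are names chosen here, with `bimapL f g` playing the role of $f \dagger g$.

```haskell
newtype Fix f = In { out :: f (Fix f) }

cata :: Functor f => (f b -> b) -> Fix f -> b
cata phi = phi . fmap (cata phi) . out

ana :: Functor f => (b -> f b) -> b -> Fix f
ana psi = In . fmap (ana psi) . psi

-- The bifunctor for cons-lists: A † X = 1 | A||X.
data ListF a x = Nil | Cons a x

-- Action of the bifunctor on functions: f † g = id | f||g.
bimapL :: (a -> b) -> (x -> y) -> ListF a x -> ListF b y
bimapL _ _ Nil        = Nil
bimapL f g (Cons a x) = Cons (f a) (g x)

-- The sectioned functor (A†) acts only on the recursive position.
instance Functor (ListF a) where
  fmap = bimapL id

type List a = Fix (ListF a)

-- Map as a catamorphism, as in definition (33).
mapCata :: (a -> b) -> List a -> List b
mapCata f = cata (In . bimapL f id)

-- Map as an anamorphism, the alternative version obtained from (27).
mapAna :: (a -> b) -> List a -> List b
mapAna f = ana (bimapL f id . out)

-- Conversions to and from built-in lists, for testing.
fromList :: [a] -> List a
fromList = ana coalg
  where
    coalg []       = Nil
    coalg (x : xs) = Cons x xs

toList :: List a -> [a]
toList = cata alg
  where
    alg Nil        = []
    alg (Cons a r) = a : r
```

On finite lists the two definitions agree with each other and with the Prelude's `map`.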
Other important laws for maps are factorization [26] and promotion [4]. 

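$$\begin{aligned}
(\!|\varphi|\!) \circ f^{*} &= (\!|\varphi \circ (f\dagger)|\!) && && (34) \\
f^{*} \circ [\!(\psi)\!] &= [\!((f\dagger) \circ \psi)\!] && && (35) \\
(\!|\varphi|\!) \circ f^{*} &= g \circ (\!|\chi|\!) && \Leftarrow\ g \circ \chi = \varphi \circ f \dagger g \;\wedge\; g \text{ strict} && (36) \\
f^{*} \circ [\!(\psi)\!] &= [\!(\xi)\!] \circ g && \Leftarrow\ \xi \circ g = f \dagger g \circ \psi && (37)
\end{aligned}$$

Now we know that $*$ is a functor, we can recognize that $\mathrm{in} \in I \dagger {*} \mathrel{\dot\to} {*}$ and $\mathrm{out} \in {*} \mathrel{\dot\to} I \dagger {*}$ are
natural transformations.

$$\begin{aligned}
f^{*} \circ \mathrm{in} &= \mathrm{in} \circ f \dagger f^{*} \\
\mathrm{out} \circ f^{*} &= f \dagger f^{*} \circ \mathrm{out}
\end{aligned}$$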
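The factorization laws (34) and (35) follow in one step each from the laws relating cata- and anamorphisms. The derivation below is a reconstruction, taking $f \in A \to B$ so that $(f\dagger)$ is a natural transformation from $(A\dagger)$ to $(B\dagger)$, and using law (31) from right to left.

```latex
\begin{aligned}
(\!|\varphi|\!)_{(B\dagger)} \circ f^{*}
  &= (\!|\varphi|\!)_{(B\dagger)} \circ (\!|\mathrm{in} \circ (f\dagger)|\!)_{(A\dagger)}
  && \text{definition (33)} \\
  &= (\!|\varphi \circ (f\dagger)|\!)_{(A\dagger)}
  && \text{(29)} \\[6pt]
f^{*} \circ [\!(\psi)\!]_{(A\dagger)}
  &= [\!((f\dagger) \circ \mathrm{out})\!]_{(B\dagger)} \circ [\!(\psi)\!]_{(A\dagger)}
  && f^{*} \text{ as an anamorphism} \\
  &= [\!((f\dagger) \circ \psi)\!]_{(B\dagger)}
  && \text{(31)}
\end{aligned}
```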



Iterate promotion 

Recall the function $\mathit{iterate}\ f = [\!(\acute{\imath} \circ \mathrm{id} \vartriangle f)\!]$. The following law turns an $O(n^2)$ algorithm into
an $O(n)$ algorithm, under the assumption that evaluating $g \circ f^{n}$ takes $n$ steps.

$$g^{*} \circ \mathit{iterate}\ f = \mathit{iterate}\ h \circ g \;\Leftarrow\; g \circ f = h \circ g \qquad (38)$$

Law (38) is an immediate consequence of the promotion law for anamorphisms (37).

Interestingly we may also define iterate as a cyclic list:

$$\mathit{iterate}\ f\ x = \mu(\lambda xs.\ \mathrm{Cons}\ (x, f^{*}\ xs))$$

and use fixed point fusion to prove (38).
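Law (38) can be observed on Haskell's lazy lists. The sketch below uses built-in lists instead of an explicit fixed point, and `unfold`, `iterateAna` and `iterateCyc` are names chosen for this example. The instance checked is f = (+1), g = (*2), h = (+2), for which g ∘ f = h ∘ g holds.

```haskell
-- Anamorphism on (possibly infinite) built-in lists.
unfold :: (b -> Maybe (a, b)) -> b -> [a]
unfold psi b = case psi b of
  Nothing      -> []
  Just (a, b') -> a : unfold psi b'

-- iterate as an anamorphism: emit the seed, continue with f applied to it.
iterateAna :: (a -> a) -> a -> [a]
iterateAna f = unfold (\x -> Just (x, f x))

-- iterate as a cyclic list: xs is defined in terms of itself.
iterateCyc :: (a -> a) -> a -> [a]
iterateCyc f x = xs
  where
    xs = x : map f xs

-- Left- and right-hand side of law (38) for f = (+1), g = (*2), h = (+2).
lhs38, rhs38 :: Int -> [Int]
lhs38 = map (* 2) . iterateAna (+ 1)
rhs38 = iterateAna (+ 2) . (* 2)
```

Only finite prefixes of these infinite lists can be compared, for example with `take`.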
Map-Reduce factorization



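A data type $(A^{*}, \mathrm{in}) = \mu(A\dagger)$ with $A \dagger X = A \mid X_F$ is called a free $F$-type over $A$. For a free
type we can always write strict catas $(\!|\psi|\!)$ as $(\!|f \triangledown \varphi|\!)$ by taking $f = \psi \circ \grave{\imath}$ and $\varphi = \psi \circ \acute{\imath}$. For
$f^{*}$ we get

$$\begin{aligned}
f^{*} &= (\!|\mathrm{in} \circ f \mid \mathrm{id}|\!) \\
      &= (\!|\mathit{tau} \triangledown \mathit{join} \circ f \mid \mathrm{id}|\!) \\
      &= (\!|\mathit{tau} \circ f \triangledown \mathit{join}|\!)
\end{aligned}$$

where $\mathit{tau} = \mathrm{in} \circ \grave{\imath}$ and $\mathit{join} = \mathrm{in} \circ \acute{\imath}$.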

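If we define the reduction with $\varphi$ as

$$\varphi/ = (\!|\mathrm{id} \triangledown \varphi|\!) \qquad (39)$$

the factorization law (34) shows that catamorphisms on a free type can be factored into a map
followed by a reduce.

$$\begin{aligned}
   & (\!|f \triangledown \varphi|\!) \\
=\ & (\!|\mathrm{id} \triangledown \varphi \circ f \mid \mathrm{id}|\!) \\
=\ & (\!|\mathrm{id} \triangledown \varphi|\!) \circ f^{*} \\
=\ & \varphi/ \circ f^{*}
\end{aligned}$$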

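The fact that tau and join are natural transformations gives evaluation rules for $f^{*}$ and $\varphi/$ on
free types.

$$\begin{aligned}
f^{*} \circ \mathit{tau} &= \mathit{tau} \circ f & \varphi/ \circ \mathit{tau} &= \mathrm{id} \\
f^{*} \circ \mathit{join} &= \mathit{join} \circ (f^{*})_F & \varphi/ \circ \mathit{join} &= \varphi \circ (\varphi/)_F
\end{aligned}$$

Early Squiggol was based completely on map-reduce factorization. Some of the laws from
the good old days are reduce promotion and map promotion.

$$\begin{aligned}
\varphi/ \circ \mathit{join}/ &= \varphi/ \circ (\varphi/)^{*} \\
f^{*} \circ \mathit{join}/ &= \mathit{join}/ \circ f^{**}
\end{aligned}$$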
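A concrete free type makes the factorization easy to test. The sketch below picks $F X = X \parallel X$, so the free type is that of leaf-labelled binary trees. The constructors `Tau` and `Join` and the functions `cataT`, `mapT` and `reduceT` are names assumed for this example, and the recursion is written out directly instead of through a generic fixed point.

```haskell
-- Free type over a for F X = X || X: leaf-labelled binary trees.
-- Tau and Join correspond to tau and join in the text.
data Tree a = Tau a | Join (Tree a) (Tree a)
  deriving (Eq, Show)

-- The catamorphism (| f v phi |) on this free type.
cataT :: (a -> b) -> (b -> b -> b) -> Tree a -> b
cataT f _   (Tau a)    = f a
cataT f phi (Join l r) = phi (cataT f phi l) (cataT f phi r)

-- Map: (| tau . f v join |).
mapT :: (a -> b) -> Tree a -> Tree b
mapT f = cataT (Tau . f) Join

-- Reduce with phi: (| id v phi |), as in (39).
reduceT :: (a -> a -> a) -> Tree a -> a
reduceT = cataT id

-- A sample tree with leaves 1, 2, 3.
sample :: Tree Int
sample = Join (Tau 1) (Join (Tau 2) (Tau 3))

-- Both sides of the factorization for f = (*2) and phi = (+).
direct, factored :: Tree Int -> Int
direct   = cataT (* 2) (+)
factored = reduceT (+) . mapT (* 2)
```

Here `direct` traverses the tree once, while `factored` first maps and then reduces; both give the same result.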



Monads 

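Any free type gives rise to a monad [17], in the above notation,
$({*},\ \mathit{tau} \in I \mathrel{\dot\to} {*},\ \mathit{join}/ \in {*}{*} \mathrel{\dot\to} {*})$, since:

$$\begin{aligned}
\mathit{join}/ \circ \mathit{tau} &= \mathrm{id} \\
\mathit{join}/ \circ \mathit{tau}^{*} &= \mathrm{id} \\
\mathit{join}/ \circ \mathit{join}/ &= \mathit{join}/ \circ \mathit{join}/^{*}
\end{aligned}$$

Wadler [29] gives a thorough discussion on the concepts of monads and their use in functional
programming.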
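The three monad equations can be checked on a small free type. As before, the sketch assumes leaf-labelled binary trees with hypothetical names `Tau`, `Join`, `cataT`, `mapT`, `reduceT` and `joinR`; here `joinR` stands for join/, the reduction with `Join`.

```haskell
-- Leaf-labelled binary trees as a free type; Tau is the unit.
data Tree a = Tau a | Join (Tree a) (Tree a)
  deriving (Eq, Show)

cataT :: (a -> b) -> (b -> b -> b) -> Tree a -> b
cataT f _   (Tau a)    = f a
cataT f phi (Join l r) = phi (cataT f phi l) (cataT f phi r)

mapT :: (a -> b) -> Tree a -> Tree b
mapT f = cataT (Tau . f) Join

reduceT :: (a -> a -> a) -> Tree a -> a
reduceT = cataT id

-- join/ : flatten a tree of trees by reducing with Join.
joinR :: Tree (Tree a) -> Tree a
joinR = reduceT Join

-- Sample data at nesting depths one and three.
t1 :: Tree Int
t1 = Join (Tau 1) (Join (Tau 2) (Tau 3))

t3 :: Tree (Tree (Tree Int))
t3 = Join (Tau (Join (Tau t1) (Tau t1))) (Tau (Tau t1))

-- The three monad equations, instantiated at the sample data.
unitLeft, unitRight, assoc :: Bool
unitLeft  = joinR (Tau t1) == t1
unitRight = joinR (mapT Tau t1) == t1
assoc     = joinR (joinR t3) == joinR (mapT joinR t3)
```

These checks only test the equations at particular values; they do not replace the calculational proof.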



6 Conclusion 

We have considered various patterns of recursive definitions, and have presented a lot of laws 
that hold for the functions so defined. Although we have illustrated the laws and the recursion 
operators with examples, the usefulness for practical program calculation might not be evident 
to every reader. Unfortunately we have not enough space here to give more elaborate examples. 

There are more aspects to program calculation than just a series of combining forms (like
$(\!|\_|\!)$, $[\!(\_)\!]$, $[\!\langle\_\rangle\!]$, $[\![\_,\_]\!]$) and laws about them. For calculating large programs one certainly needs high
level algorithmic theorems. The work reported here provides the necessary tools to develop 
such theorems. For the theory of lists Bird [3] has started to do so, and with success. 

Another aspect of program calculation is machine assistance. Our experience — including that 
of our colleagues — shows that the size of formal manipulations is much greater than in most 
textbooks of mathematics; it may well be comparable in size to "computer algebra" as done 
in systems like MACSYMA, Maple, Mathematica etc. Fortunately, it also appears that most 
manipulations are easily automated and, moreover, that quite a few equalities depend on natural 
transformations. Thus in several cases type checking alone suffices. Clearly machine assistance 
is fruitful and does not seem to be too difficult. 

Finally we observe that category theory has provided several notions and concepts that were 
indispensable to get a clean and smooth theory; for example, the notions of functor and natural 
transformation. (While reading this paper, a category theorist may recognize several other 
notions that we silently used). Without doubt there is much more categorical knowledge that 
can be useful for program calculation; we are just at the beginning of an exciting development. 